Characteristics of a Production Parallel Scientific Workload
Abstract
Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors. Most successful systems are based on a solid understanding of the expected workload, but thus far there have been no comprehensive workload characterizations of multiprocessor file systems. This paper presents the results of a three-week tracing study in which all file-related activity on a massively parallel computer was recorded. Our instrumentation differs from previous efforts in that it collects information about every I/O request and about the mix of jobs running in a production environment. We also present the results of a trace-driven caching simulation and recommendations for designers of multiprocessor file systems.
Similar resources
Characteristics of Parallel Scientific Workloads
Phenomenal improvements in the computational performance of multiprocessors have not been matched by comparable gains in I/O system performance. This imbalance has resulted in I/O becoming a significant bottleneck for many scientific applications. One key to overcoming this bottleneck is improving the performance of parallel file systems. The design of a high-performance parallel file system requires ...
The Galley Parallel File System
Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within t...
A Hypergraph-Based Workload Partitioning Strategy for Parallel Data Aggregation
This paper presents an algorithm to efficiently carry out data aggregation operations on large disk-based datasets on a parallel machine. This algorithm employs a hypergraph formulation for partitioning the workload among processors. Data aggregation is a common operation executed by applications that explore and analyze very large multi-dimensional scientific datasets. A data element in these dat...
A Comparison of Workload Traces from Two Production Parallel Machines
The analysis of workload traces from real production parallel machines can aid a wide variety of parallel processing research, providing a realistic basis for experimentation in the management of resources over an entire workload. We analyze a five-month workload trace of an Intel Paragon machine supporting a production parallel workload at the San Diego Supercomputer Center (SDSC), comparing and...
Galley: A New Parallel File System for Scientific Workloads
Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Most multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access those multiple disks transparently. This interface conceals the parallelis...